Add cause chain to failure-isolation event (0068)#162

Merged
chris-colinsky merged 3 commits into main from feature/0068-failure-isolation-cause-chain
Jun 17, 2026

@chris-colinsky

Member

Summary

When FailureIsolationMiddleware catches an error and degrades a node, the FailureIsolatedEvent it emits now carries the full structured cause chain alongside the existing single category and message. CaughtException gains a chain: an ordered list of CauseLink records (category, message, and a carrier flag) from the caught exception down to the originating raise, with the engine's node_exception wrapper layers flagged so consumers can skip them.
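
As a rough illustration of the chain shape described above, the following sketch walks `__cause__` from a caught exception down to the originating raise and records one link per layer. It is not the code from this PR: the `CauseLink` field names are inferred from the summary, and `NodeException`, `CategorizedError`, `build_cause_chain`, the `category` attribute lookup, and the example category string are all hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CauseLink:
    """One layer of the cause chain (field names assumed from the summary)."""

    category: str | None
    message: str
    carrier: bool


class NodeException(Exception):
    """Hypothetical stand-in for the engine's node_exception wrapper."""


class CategorizedError(Exception):
    """Hypothetical error type that exposes a category attribute."""

    def __init__(self, category: str, message: str) -> None:
        super().__init__(message)
        self.category = category


def build_cause_chain(exc: BaseException) -> list[CauseLink]:
    """Walk __cause__ from the caught exception to the originating raise."""
    chain: list[CauseLink] = []
    seen: set[int] = set()  # guards against a cyclic __cause__ chain
    current: BaseException | None = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        chain.append(
            CauseLink(
                category=getattr(current, "category", None),
                message=str(current),
                carrier=isinstance(current, NodeException),
            )
        )
        current = current.__cause__
    return chain


def make_example() -> BaseException:
    """Raise a categorized error and wrap it in a carrier, as an engine might."""
    try:
        try:
            raise CategorizedError("provider_timeout", "upstream timed out")
        except CategorizedError as inner:
            raise NodeException("node 'fetch' failed") from inner
    except NodeException as outer:
        return outer


chain = build_cause_chain(make_example())
```

Here `chain` holds two links, outermost first: the carrier wrapper with no category, then the originating categorized error. A consumer that only cares about real causes could filter with `[link for link in chain if not link.carrier]`.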

The single category / message are retained and redefined as a derivation over the chain (the outermost non-carrier link carrying a category). The derivation reproduces the prior 0065 values exactly, so the change is additive: existing consumers and the bundled OTel and Langfuse observers are unaffected. This supersedes proposal 0065's single "originating cause" representation, which was ambiguous once the post-carrier chain held more than one non-carrier link.
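
The derivation rule can be sketched as a single scan over the chain, outermost link first. This is a minimal sketch under assumptions, not the middleware's actual code: `derive_cause` is a hypothetical name, the category strings are invented, and returning `None` when no link qualifies is an assumption the PR text does not confirm.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class CauseLink:
    """One layer of the cause chain (field names assumed from the summary)."""

    category: str | None
    message: str
    carrier: bool


def derive_cause(chain: Sequence[CauseLink]) -> tuple[str, str] | None:
    """Return (category, message) of the outermost non-carrier link with a category."""
    for link in chain:  # the chain is ordered outermost first
        if not link.carrier and link.category:
            return link.category, link.message
    return None  # assumption: the fallback behavior is not stated in the PR


# A chain with two non-carrier links, the case a single "originating cause"
# could not represent unambiguously.
chain = [
    CauseLink(None, "node 'fetch' failed", carrier=True),
    CauseLink("provider_unavailable", "LLM call failed", carrier=False),
    CauseLink("network", "connection reset", carrier=False),
]
derived = derive_cause(chain)
```

With this input the carrier is skipped and the re-categorized outer link wins over the original `network` raise, which is the "outermost non-carrier link carrying a category" wording applied literally.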

Adopts proposal 0068 (spec v0.57.0). First of four PRs folding the v0.14.0 consolidated spec-review findings (0068 through 0071) into the release.

Changes

  • CauseLink dataclass and CaughtException.chain field, exported from openarmature.graph.
  • FailureIsolationMiddleware builds the chain on catch and derives the single category / message from it.
  • Pinned spec advanced to v0.57.0; conformance manifest entry; changelog; middleware concept-doc update.
  • Conformance fixture 066 wired in: the pipeline-utilities runner gained a node-level-middleware translator for the fixture's node-nested shape. Four focused unit tests cover the carrier, nested-carrier, and re-categorization cases.

Testing

  • uv run pytest tests/unit tests/conformance: 1269 passed, 280 skipped.
  • ruff, pyright, and mkdocs build --strict all clean.

FailureIsolatedEvent.caught_exception gains a structured chain: an
ordered list of CauseLink records (category, message, and a carrier
flag) from the caught exception to the originating raise, with engine
node_exception wrappers flagged. The single category and message are
retained and redefined as a derivation over the chain, reproducing the
prior 0065 values, so the change is additive: existing consumers and
the bundled OTel and Langfuse observers are unaffected.

This supersedes 0065's single originating-cause representation, which
was ambiguous when the post-carrier chain held more than one
non-carrier link. Advance the pinned spec to v0.57.0 with conformance
fixture 066 plus unit tests for the carrier, nested-carrier, and
re-categorization cases.
Copilot AI review requested due to automatic review settings June 17, 2026 18:56

Copilot AI left a comment

Pull request overview

Implements spec proposal 0068 (spec v0.57.0) by extending failure-isolation telemetry so FailureIsolatedEvent.caught_exception includes a structured, ordered cause chain, while preserving the previously exposed derived category/message behavior for existing consumers.

Changes:

  • Added CauseLink and extended CaughtException with a chain representing the full __cause__ chain with engine carrier wrappers flagged.
  • Updated FailureIsolationMiddleware to build the chain and derive the legacy category/message from it.
  • Bumped spec pin/versioning and updated conformance harness + unit tests + docs/changelog accordingly.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `tests/unit/test_failure_isolation_middleware.py` | Adds unit tests validating chain construction and derivation behavior (carrier, nested carriers, recategorization). |
| `tests/test_smoke.py` | Updates asserted `__spec_version__` to 0.57.0. |
| `tests/conformance/test_pipeline_utilities.py` | Extends conformance runner to translate node-nested middleware and asserts expected cause chains. |
| `src/openarmature/graph/middleware/failure_isolation.py` | Builds cause chain and derives single category/message from it when emitting events. |
| `src/openarmature/graph/events.py` | Introduces `CauseLink`; extends `CaughtException` with `chain`; updates event docs/exports. |
| `src/openarmature/graph/__init__.py` | Re-exports `CauseLink` from `openarmature.graph`. |
| `src/openarmature/__init__.py` | Bumps `__spec_version__` to 0.57.0. |
| `pyproject.toml` | Bumps `[tool.openarmature].spec_version` to 0.57.0. |
| `docs/concepts/middleware.md` | Updates failure-isolation documentation to describe the derived fields plus full chain. |
| `conformance.toml` | Advances `spec_pin` to v0.57.0 and records proposal 0068 as implemented. |
| `CHANGELOG.md` | Adds changelog entry for proposal 0068 and updates spec-pin advancement narrative. |

Comment thread src/openarmature/graph/events.py
Comment thread src/openarmature/graph/middleware/failure_isolation.py
The bundled agent guide embeds the spec-pin version stamps, which the
test_agents_md_drift check verifies against a fresh regeneration. The
v0.57.0 submodule bump left them reading v0.56.0; regenerate via
scripts/build_agents_md.py. Generated artifact only, no behavior
change.

`_build_cause_chain` recorded an empty-string `category` verbatim, but
`CauseLink.category` is documented as a non-empty string or None and
`_derive_cause` already treats an empty string as no-category. Coerce it
to None so the chain representation matches. No exception carries an
empty-string category in practice; addresses PR review feedback.
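
A sketch of the coercion this commit describes, assuming a small helper applied wherever a link's category is recorded. `normalize_category` is a hypothetical name, and treating non-string values as no-category is an additional assumption, not something the commit message states.

```python
from __future__ import annotations


def normalize_category(raw: object) -> str | None:
    """Return a non-empty string category, or None for anything else."""
    if isinstance(raw, str) and raw:
        return raw
    # Empty string (and, by assumption, any non-string value) means no category.
    return None
```

Normalizing at chain-build time keeps the recorded link consistent with a derivation step that already skips empty-string categories, so both views agree on which links carry a category.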
Copilot AI review requested due to automatic review settings June 17, 2026 19:10

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

@chris-colinsky chris-colinsky merged commit 0c8564c into main Jun 17, 2026
7 checks passed
@chris-colinsky chris-colinsky deleted the feature/0068-failure-isolation-cause-chain branch June 17, 2026 19:16
2 participants